Keys Per Processor (n/p): Radix Sort, Bitonic Sort, Sample Sort, Simple Radix Sort
Abstract
We have developed a methodology for predicting the performance of parallel algorithms on real parallel machines. The methodology consists of two steps. First, we characterize a machine by enumerating the primitive operations that it is capable of performing, along with the cost of each operation. Next, we analyze an algorithm by making a precise count of the number of times the algorithm performs each type of operation. We have used this methodology to evaluate many of the parallel sorting algorithms proposed in the literature. Of these, we selected the three most promising (Batcher's bitonic sort, a parallel radix sort, and a sample sort similar to Reif and Valiant's flashsort) and implemented them on the Connection Machine model CM-2. This paper analyzes the three algorithms in detail and discusses the issues that led us to our particular implementations. On the CM-2, the predicted performance of the algorithms closely matches the observed performance, and hence our methodology can be used to tune the algorithms for optimal performance. Although our programs were designed for the CM-2, our conclusions about the merits of the three algorithms apply to other parallel machines as well.
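The two-step methodology described in the abstract can be illustrated with a minimal sketch: a table of per-operation costs for a machine, a table of per-operation counts for an algorithm, and a predicted time equal to their weighted sum. The operation names and cost values below are hypothetical placeholders, not figures from the paper.

```python
def predict_time(op_counts, op_costs):
    """Predict running time as the sum over operations of count * cost.

    op_counts: how many times the algorithm performs each primitive operation.
    op_costs:  cost of one execution of each primitive on the target machine.
    """
    missing = set(op_counts) - set(op_costs)
    if missing:
        raise KeyError(f"no cost given for operations: {sorted(missing)}")
    return sum(count * op_costs[op] for op, count in op_counts.items())


# Hypothetical machine characterization (arbitrary time units per operation).
machine_costs = {"arith": 1.0, "send": 130.0, "scan": 50.0}

# Hypothetical operation counts for one algorithm on one input size.
algorithm_counts = {"arith": 200, "send": 4, "scan": 8}

predicted = predict_time(algorithm_counts, machine_costs)
```

Comparing such predictions across candidate algorithms, or across parameter settings of one algorithm, is one way the tuning mentioned in the abstract could be approached.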
Similar resources
Partitioned Parallel Radix Sort
Load balanced parallel radix sort solved the load imbalance problem present in parallel radix sort. By redistributing the keys in each round of radix, each processor has exactly the same number of keys, thereby reducing the overall sorting time. Load balanced radix sort is currently known as the fastest internal sorting method for distributed-memory multiprocessors. However, as the computation ...
A Fast Radix Sort
Almost all computers regularly sort data. Many different sort algorithms have therefore been proposed, and the properties of these algorithms studied in great detail. It is known that no sort algorithm based on key comparisons can sort N keys in less than O(N log N) operations, and that many perform O(N^2) operations in the worst case. The radix sort has the attractive feature that it can sort N k...
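As a point of reference for the radix sort discussed in these abstracts, here is a minimal sequential sketch of a least-significant-digit radix sort built from counting passes. It is a generic textbook formulation, not the implementation from any of the listed papers; the function name and the `key_bits` and `radix_bits` parameters are assumptions chosen for illustration, and keys are assumed to be non-negative integers below `2**key_bits`.

```python
def radix_sort(keys, key_bits=32, radix_bits=8):
    """Sort non-negative integers below 2**key_bits, radix_bits bits per pass."""
    keys = list(keys)  # work on a copy so the caller's data is left untouched
    mask = (1 << radix_bits) - 1
    for shift in range(0, key_bits, radix_bits):
        # Count how many keys fall into each digit bucket for this pass.
        counts = [0] * (1 << radix_bits)
        for k in keys:
            counts[(k >> shift) & mask] += 1
        # Exclusive prefix sum turns counts into starting offsets.
        total = 0
        for d in range(len(counts)):
            counts[d], total = total, total + counts[d]
        # Stable scatter of each key to its bucket's next free slot.
        out = [0] * len(keys)
        for k in keys:
            d = (k >> shift) & mask
            out[counts[d]] = k
            counts[d] += 1
        keys = out
    return keys


sorted_keys = radix_sort([170, 45, 75, 90, 802, 24, 2, 66])
```

Each pass does no key comparisons, so the total work is proportional to the number of keys times the number of passes (`key_bits / radix_bits`, rounded up).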
Conscious Radix Sort
The exploitation of data locality in parallel computers is paramount to reduce the memory traffic and communication among processing nodes. We focus on the exploitation of locality by Parallel Radix sort. The original Parallel Radix sort has several communication steps in which one sorting key may have to visit several processing nodes. In response to this, we propose a reorganization of Radix so...
The Effect of Local Sort on Parallel Sorting Algorithms
We show the importance of sequential sorting in the context of in-memory parallel sorting of large data sets of 64-bit keys. First, we analyze several sequential strategies such as Straight Insertion, Quick sort, Radix sort and CC-Radix sort. As a consequence of the analysis, we propose a new algorithm that we call Sequential Counting Split Radix sort, SCS-Radix sort. SCS-Radix sort is a combinati...
Fault-tolerant sorting in SIMD hypercubes
This paper considers sorting in SIMD hypercube multiprocessors in the presence of node failures. The proposed algorithm correctly sorts up to 2^n = N keys in a faulty SIMD hypercube of dimension n containing up to n - 1 faulty nodes. The proposed fault-tolerant algorithm employs radix sort. We use a pair of flood dimensions which helps to route data around the faulty processors durin...
Publication date: 1998